Parametric RDT approach to computational gap of symmetric binary perceptron
We study the potential presence of statistical-computational gaps (SCG) in symmetric binary perceptrons (SBP) via a parametric utilization of \emph{fully lifted random duality theory} (fl-RDT) [96]. A structural change from a decreasingly ordered to an arbitrarily ordered $c$-sequence (a key fl-RDT parametric component) is observed on the second lifting level and associated with the change from the \emph{satisfiability} ($α_c$) to the \emph{algorithmic} ($α_a$) constraint density threshold, thereby suggesting the potential existence of a nonzero computational gap $SCG=α_c-α_a$. The second-level estimate is shown to match the theoretical $α_c$, whereas the $r\rightarrow \infty$ level one is proposed to correspond to $α_a$. For example, for the canonical SBP ($κ=1$ margin) we obtain $α_c\approx 1.8159$ on the second level and $α_a\approx 1.6021$ (with a converging tendency towards the $\sim 1.59$ range) on the seventh. Our propositions concur remarkably well with recent literature: (i) in [20] a local entropy replica approach predicts $α_{LE}\approx 1.58$ as the onset of clustering defragmentation (the presumed driving force behind the failure of locally improving algorithms); (ii) in the $α\rightarrow 0$ regime we obtain on the third lifting level $κ\approx 1.2385\sqrt{\frac{α_a}{-\log\left ( α_a \right ) }}$, which qualitatively matches the overlap gap property (OGP) based predictions of [43] and identically matches the local entropy based predictions of [24]; (iii) the $c$-sequence ordering change phenomenology mirrors the one observed for the asymmetric binary perceptron (ABP) in [98] and the negative Hopfield model in [100]; and (iv) as in [98,100], we design a CLuP based algorithm whose practical performance closely matches the proposed theoretical predictions.
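The small-$α$ margin scaling quoted above is easy to sanity-check numerically. The sketch below is purely illustrative: the constant 1.2385 is taken from the abstract, and the function name is hypothetical.

```python
import math

def kappa_small_alpha(alpha, c=1.2385):
    """Third-lifting-level small-alpha margin estimate from the abstract:
    kappa ~ c * sqrt(alpha / (-log(alpha))), for 0 < alpha < 1."""
    assert 0.0 < alpha < 1.0
    return c * math.sqrt(alpha / (-math.log(alpha)))

# The predicted margin shrinks monotonically as the constraint
# density alpha tends to 0.
for a in (0.1, 0.01, 0.001):
    print(f"alpha={a:<6} kappa~{kappa_small_alpha(a):.4f}")
```

As expected from the formula, the required margin vanishes (slightly faster than $\sqrt{α}$) as the density goes to zero.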
Neural Beamforming with Doppler-Aware Sparse Attention for High Mobility Environments
Vahapoglu, Cemil, O'Shea, Timothy J., Liu, Wan, Ulukus, Sennur
Beamforming is central to enhancing spectral efficiency and mitigating interference in multi-antenna wireless systems, facilitating spatial multiplexing and diversity in dense, high-mobility scenarios. Traditional beamforming techniques such as zero-forcing beamforming (ZFBF) and minimum mean square error (MMSE) beamforming suffer performance deterioration under adverse channel conditions. Deep learning-based beamforming offers an alternative: learning nonlinear mappings from channel state information (CSI) to beamforming weights improves robustness against dynamic channel environments. Transformer-based models are particularly effective due to their ability to model long-range dependencies across time and frequency. However, their quadratic attention complexity limits scalability on large OFDM grids. Recent studies address this issue through sparse attention mechanisms that reduce complexity while maintaining expressiveness, yet these often employ patterns that disregard channel dynamics, as they are not designed specifically for wireless communication scenarios. In this work, we propose a Doppler-aware Sparse Neural Network Beamforming (Doppler-aware Sparse NNBF) model that incorporates a channel-adaptive sparse attention mechanism in a multi-user single-input multiple-output (MU-SIMO) setting. The proposed sparsity structure is configurable along the 2D time-frequency axes based on channel dynamics and is theoretically proven to ensure full connectivity within p hops, where p is the number of attention heads. Simulation results under urban macro (UMa) channel conditions show that Doppler-aware Sparse NNBF significantly outperforms both a fixed-pattern baseline, referred to as Standard Sparse NNBF, and the conventional beamforming techniques ZFBF and MMSE in high-mobility scenarios, while maintaining structured sparsity with a controlled number of attended keys per query.
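The p-hop connectivity claim can be illustrated on a toy mask. The sketch below is not the authors' pattern: it builds a hypothetical banded sparse attention mask over a flattened time-frequency grid (a wider time window standing in for high Doppler) and checks by breadth-first search that every token reaches every other token within p hops.

```python
from collections import deque

def banded_mask(T, F, wt, wf):
    """Hypothetical 2D banded sparsity: token (t, f) attends to (t2, f2)
    iff |t - t2| <= wt and |f - f2| <= wf. Returns adjacency lists over
    the flattened T*F token grid."""
    idx = lambda t, f: t * F + f
    adj = [[] for _ in range(T * F)]
    for t in range(T):
        for f in range(F):
            for t2 in range(max(0, t - wt), min(T, t + wt + 1)):
                for f2 in range(max(0, f - wf), min(F, f + wf + 1)):
                    adj[idx(t, f)].append(idx(t2, f2))
    return adj

def hops_to_cover(adj, src=0):
    """BFS from src: number of hops until every token is reached."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    assert len(dist) == len(adj), "mask is not connected"
    return max(dist.values())

# High-mobility setting: widen the time window (fast channel variation),
# keep the frequency window narrow. With p = 4 hops the band covers the
# whole 8 x 4 grid, while each query attends to at most 15 keys.
adj = banded_mask(T=8, F=4, wt=2, wf=1)
print("hops needed:", hops_to_cover(adj))
```

Stacking p such sparse attention layers (one per hop) thus gives every query an effective receptive field over the full grid, which is the intuition behind the connectivity guarantee.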
Binary perceptron computational gap -- a parametric fl RDT view
Recent studies suggest that the asymmetric binary perceptron (ABP) likely exhibits the so-called statistical-computational gap, characterized by the appearance of two phase-transitioning constraint density thresholds: \textbf{\emph{(i)}} the \emph{satisfiability threshold} $α_c$, below/above which the ABP succeeds/fails to operate as a storage memory; and \textbf{\emph{(ii)}} the \emph{algorithmic threshold} $α_a$, below/above which one can/cannot efficiently determine ABP's weights so that it operates as a storage memory. We consider a particular parametric utilization of \emph{fully lifted random duality theory} (fl RDT) [85] and study its potential algorithmic implications for the ABP. A remarkable structural parametric change is uncovered as one progresses through the fl RDT lifting levels. On the first two levels, the so-called $c$-sequence -- a key parametric fl RDT component -- is of the (natural) decreasing type. A change of this phenomenology on higher levels is then connected to the $α_c$ -- $α_a$ threshold change. Namely, on the second level concrete numerical values give the critical constraint density $α_c\approx 0.8331$. While progressing through higher levels decreases this estimate, already on the fifth level we observe a satisfactory degree of convergence and obtain $α\approx 0.7764$. This allows us to draw two striking parallels: \textbf{\emph{(i)}} the obtained constraint density estimate is in remarkable agreement with the range $α\in (0.77,0.78)$ of clustering defragmentation (believed to be responsible for the failure of locally improving algorithms) [17,88]; and \textbf{\emph{(ii)}} the observed change of $c$-sequence phenomenology closely matches that of the negative Hopfield model, for which the existence of efficient algorithms closely approaching a similar type of threshold has been demonstrated recently [87].
ResearchGPT: Benchmarking and Training LLMs for End-to-End Computer Science Research Workflows
Wang, Penghao, Zhou, Yuhao, Wu, Mengxuan, Qin, Ziheng, Zhu, Bangyuan, Huang, Shengbin, Zhao, Xuanlei, Zhang, Panpan, Peng, Xiaojiang, Shang, Yuzhang, Yang, Jianfei, Zhu, Zheng, Chen, Tianlong, Wang, Zhangyang, Wang, Kai
As large language models (LLMs) advance, the ultimate vision for their role in science is emerging: we could build an AI collaborator to effectively assist human beings throughout the entire scientific research process. We refer to this envisioned system as ResearchGPT. Given that scientific research progresses through multiple interdependent phases, achieving this vision requires rigorous benchmarks that evaluate the end-to-end workflow rather than isolated sub-tasks. To this end, we contribute CS-54k, a high-quality corpus of scientific Q&A pairs in computer science, built from 14k CC-licensed papers. It is constructed through a scalable, paper-grounded pipeline that combines retrieval-augmented generation (RAG) with multi-stage quality control to ensure factual grounding. From this unified corpus, we derive two complementary subsets: CS-4k, a carefully curated benchmark for evaluating AI's ability to assist scientific research, and CS-50k, a large-scale training dataset. Extensive experiments demonstrate that CS-4k stratifies state-of-the-art LLMs into distinct capability tiers. Open models trained on CS-50k with supervised training and reinforcement learning demonstrate substantial improvements. Even 7B-scale models, when properly trained, outperform many larger proprietary systems, such as GPT-4.1, GPT-4o, and Gemini 2.5 Pro. This indicates that making AI models better research assistants relies more on domain-aligned training with high-quality data than on pretraining scale or general benchmark performance. We release CS-4k and CS-50k in the hope of fostering AI systems as reliable collaborators in CS research.
Journalists' Perceptions of Artificial Intelligence and Disinformation Risks
Peña-Alonso, Urko, Peña-Fernández, Simón, Meso-Ayerdi, Koldobika
This study examines journalists' perceptions of the impact of artificial intelligence (AI) on disinformation, a growing concern in journalism due to the rapid expansion of generative AI and its influence on news production and media organizations. Using a quantitative approach, a structured survey was administered to 504 journalists in the Basque Country, identified through official media directories and with the support of the Basque Association of Journalists. This survey, conducted online and via telephone between May and June 2024, included questions on sociodemographic and professional variables, as well as attitudes toward AI's impact on journalism. The results indicate that a large majority of journalists (89.88%) believe AI will considerably or significantly increase the risks of disinformation, and this perception is consistent across genders and media types, but more pronounced among those with greater professional experience. Statistical analyses reveal a significant association between years of experience and perceived risk, and between AI use and risk perception. The main risks identified are the difficulty in detecting false content and deepfakes, and the risk of obtaining inaccurate or erroneous data. Co-occurrence analysis shows that these risks are often perceived as interconnected. These findings highlight the complex and multifaceted concerns of journalists regarding AI's role in the information ecosystem.
Personalized Constitutionally-Aligned Agentic Superego: Secure AI Behavior Aligned to Diverse Human Values
Watson, Nell, Amer, Ahmed, Harris, Evan, Ravindra, Preeti, Zhang, Shujun
Agentic AI systems, possessing capabilities for autonomous planning and action, show great potential across diverse domains. However, their practical deployment is hindered by challenges in aligning their behavior with varied human values, complex safety requirements, and specific compliance needs. Existing alignment methodologies often falter when faced with the complex task of providing personalized context without inducing confabulation or operational inefficiencies. This paper introduces a novel solution: a 'superego' agent, designed as a personalized oversight mechanism for agentic AI. This system dynamically steers AI planning by referencing user-selected 'Creed Constitutions' encapsulating diverse rule sets -- with adjustable adherence levels to fit non-negotiable values. A real-time compliance enforcer validates plans against these constitutions and a universal ethical floor before execution. We present a functional system, including a demonstration interface with a prototypical constitution-sharing portal, and successful integration with third-party models via the Model Context Protocol (MCP). Comprehensive benchmark evaluations (HarmBench, AgentHarm) demonstrate that our Superego agent dramatically reduces harmful outputs -- achieving up to a 98.3% harm score reduction and near-perfect refusal rates (e.g., 100% with Claude Sonnet 4 on AgentHarm's harmful set) for leading LLMs like Gemini 2.5 Flash and GPT-4o. This approach substantially simplifies personalized AI alignment, rendering agentic systems more reliably attuned to individual and cultural contexts, while also enabling substantial safety improvements. An overview of this research, with examples, is available at https://superego.creed.space.
CLuP practically achieves $\sim 1.77$ positive and $\sim 0.33$ negative Hopfield model ground state free energy
We study algorithmic aspects of finding the $n$-dimensional \emph{positive} and \emph{negative} Hopfield ($\pm$Hop) model ground state free energies. This corresponds to classical maximization of random positive/negative semi-definite quadratic forms over binary $\left \{\pm \frac{1}{\sqrt{n}} \right \}^n$ vectors. The key algorithmic question is whether these problems can be computationally efficiently approximated within a factor $\approx 1$. Following the introduction and success of \emph{Controlled Loosening-up} (CLuP-SK) algorithms in finding near ground state energies of the closely related Sherrington-Kirkpatrick (SK) models [82], we here propose CLuP$\pm$Hop counterparts for the $\pm$Hop models. Fully lifted random duality theory (fl RDT) [78] is utilized to characterize the \emph{typical} dynamics of CLuP$\pm$Hop. An excellent agreement between practical performance and theoretical predictions is observed. In particular, for $n$ as small as a few thousand, CLuP$\pm$Hop achieves $\sim 1.77$ and $\sim 0.33$ as the ground state free energies of the positive and negative Hopfield models. At the same time, on the 6th level of lifting (6-spl RDT) we obtain the corresponding theoretical thermodynamic ($n\rightarrow\infty$) limits $\approx 1.7784$ and $\approx 0.3281$. This positions determining near ground state energies of Hopfield models as a \emph{typically} easy problem. Moreover, the very same 6th lifting level evaluations allow us to uncover a fundamental intrinsic difference between the two models: $+$Hop's near optimal configurations are \emph{typically close} to each other, whereas $-$Hop's are \emph{typically far away}.
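The underlying optimization can be made concrete with a toy single-spin-flip local search. This is purely illustrative (CLuP itself is a different, more elaborate scheme, and the normalization here need not match the abstract's free-energy convention): it greedily maximizes x^T A x over x in {+-1/sqrt(n)}^n for a positive semi-definite A = G G^T / n.

```python
import random

def greedy_qp_max(A, n, seed=0):
    """Greedy single-spin-flip ascent for max of x^T A x over
    x in {+-1/sqrt(n)}^n (a toy stand-in for CLuP-style solvers);
    working with s = sqrt(n) * x in {-1,+1}^n, the objective is
    s^T A s / n."""
    rng = random.Random(seed)
    s = [rng.choice((-1, 1)) for _ in range(n)]

    def value(t):
        return sum(t[i] * A[i][j] * t[j]
                   for i in range(n) for j in range(n)) / n

    cur = value(s)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            s[i] = -s[i]
            new = value(s)
            if new > cur + 1e-12:
                cur, improved = new, True
            else:
                s[i] = -s[i]  # undo the non-improving flip
    return s, cur

# Positive-Hopfield-type instance: A = G G^T / n is PSD.
n = 30
rng = random.Random(1)
G = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
A = [[sum(G[i][k] * G[j][k] for k in range(n)) / n for j in range(n)]
     for i in range(n)]
s, val = greedy_qp_max(A, n)
print(f"local-max objective value: {val:.3f}")
```

The loop terminates at a configuration where no single spin flip improves the objective; whether such simple local dynamics reach near-optimal values at scale is exactly the question the abstract's \emph{typically easy} claim addresses.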
On the Inevitability of Left-Leaning Political Bias in Aligned Language Models
The guiding principle of AI alignment is to train large language models (LLMs) to be harmless, helpful, and honest (HHH). At the same time, there are mounting concerns that LLMs exhibit a left-wing political bias. Yet the commitment to AI alignment cannot be harmonized with the latter critique. In this article, I argue that intelligent systems trained to be harmless and honest must necessarily exhibit left-wing political bias. Normative assumptions underlying alignment objectives inherently concur with progressive moral frameworks and left-wing principles, emphasizing harm avoidance, inclusivity, fairness, and empirical truthfulness. Conversely, right-wing ideologies often conflict with alignment guidelines. Yet research on political bias in LLMs consistently frames its insights about left-leaning tendencies as a risk, as problematic, or as concerning. In this way, researchers are actively arguing against AI alignment, tacitly fostering the violation of HHH principles.